[improve][broker] optimize namespaceBundle validation to fix single-thread 100% CPU during unloading entire namespaces. #25626
@AnonHxy @BewareMyPower @dao-jun @lhotari
Please fix any license or checkstyle issues first so that we can make progress with CI. You can check locally with this command: ./gradlew rat spotlessCheck checkstyleMain checkstyleTest
Fixed. 😊
Thanks for the contribution @zhaizhibo
Local Claude Code review
These comments come from a local Claude Code review run by @lhotari, not from a maintainer review. Treat them as suggestions to investigate, not blocking decisions.
Summary
The optimization is correct and the algorithmic improvement (O(N²) → O(N) for full-namespace operations) is real. The cache invalidation story upstream is sound — A few items worth a closer look before merge:
1. Dead
@zhaizhibo Please check the above comment from a local Claude Code review. I think that breaking the protected method API is OK since the PulsarWebResource class isn't meant to be extended by external code.
@lhotari |
9e11e84 to 1ea18a3
Motivation
When validating namespace bundle ranges via validateNamespaceBundleRange, the method calls NamespaceBundleFactory.getBundles(NamespaceName, BundlesData), which constructs a new NamespaceBundles object every time. This object contains all bundles for the entire namespace, and its construction involves expensive string formatting operations (e.g., toString() for each bundle boundary).
For a namespace with N bundles, operations like unload or clearBacklog that iterate over all bundles will call validateNamespaceBundleRange for each one, resulting in O(N²) total construction work. For example, a namespace with 4000 bundles would require ~16,000,000 string operations during a single unload, as each of the 4000 bundle validations redundantly reconstructs all 4000 bundle objects.
The NamespaceBundleFactory already maintains a cache (bundlesCache) via getBundlesAsync(NamespaceName) that computes NamespaceBundles once and reuses it. Using this cache eliminates the repeated O(N) construction per validation, reducing the total work from O(N²) to O(N).
Modifications
Add validateNamespaceBundleRangeAsync(NamespaceName, String), which uses getBundlesAsync() (cached) instead of getBundles(NamespaceName, BundlesData) (re-constructed each call).
Update isBundleOwnedByAnyBroker to use validateNamespaceBundleRangeAsync, removing the BundlesData parameter since bundles are now fetched from cache rather than passed by the caller.
Replace validateNamespaceBundleRange with the corresponding asynchronous call validateNamespaceBundleRangeAsync.
Verifying this change
Make sure that the change passes the CI checks.
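For reviewers who want to sanity-check the O(N²) versus O(N) claim from the Motivation section without a running broker, the following is a minimal standalone sketch. It is not Pulsar code: BundleValidationSketch, buildAllBundles, getBundlesCached and the counter are hypothetical stand-ins, and the range strings only imitate the bundle boundary formatting described above. It counts how many range strings get formatted when every validation rebuilds the full bundle list, compared with when the list is served from a per-namespace cache of futures.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical model of the two validation strategies; none of these names are Pulsar APIs.
public class BundleValidationSketch {
    // Counts formatted range strings, standing in for the expensive construction work.
    public static final AtomicLong formatCount = new AtomicLong();

    // Assumed stand-in for the bundle cache: one future per namespace.
    private static final Map<String, CompletableFuture<List<String>>> cache =
            new ConcurrentHashMap<>();

    // Builds all n bundle ranges of a namespace, formatting one string per bundle.
    public static List<String> buildAllBundles(int n) {
        List<String> ranges = new ArrayList<>(n);
        long step = 0x100000000L / n;
        for (int i = 0; i < n; i++) {
            long lower = i * step;
            long upper = (i == n - 1) ? 0xffffffffL : (i + 1) * step;
            ranges.add(String.format("0x%08x_0x%08x", lower, upper));
            formatCount.incrementAndGet();
        }
        return ranges;
    }

    // Returns the cached bundle list, building it only on the first request.
    public static CompletableFuture<List<String>> getBundlesCached(String namespace, int n) {
        return cache.computeIfAbsent(namespace,
                k -> CompletableFuture.completedFuture(buildAllBundles(n)));
    }

    // Validates every bundle while rebuilding the full list each time: n * n formats.
    public static long validateAllRebuilding(int n) {
        List<String> targets = buildAllBundles(n);
        formatCount.set(0); // count only the work done during validation
        for (String target : targets) {
            if (!buildAllBundles(n).contains(target)) {
                throw new IllegalStateException("unknown bundle range " + target);
            }
        }
        return formatCount.get();
    }

    // Validates every bundle against the cached list: n formats for the single build.
    public static long validateAllCached(String namespace, int n) {
        cache.remove(namespace); // start from a cold cache
        List<String> targets = buildAllBundles(n);
        formatCount.set(0); // count only the work done during validation
        for (String target : targets) {
            boolean known = getBundlesCached(namespace, n)
                    .thenApply(bundles -> bundles.contains(target))
                    .join();
            if (!known) {
                throw new IllegalStateException("unknown bundle range " + target);
            }
        }
        return formatCount.get();
    }

    public static void main(String[] args) {
        int n = 500;
        System.out.println("rebuild per validation: " + validateAllRebuilding(n));
        System.out.println("cached:                 " + validateAllCached("tenant/ns", n));
    }
}
```

With n = 500 the first count is 250,000 and the second is 500, which mirrors the 4000 × 4000 versus 4000 figures quoted in the Motivation. The sketch deliberately ignores cache invalidation and real ownership checks; it only models the amount of construction work.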
Dependencies (add or upgrade a dependency)
The public API
The schema
The default values of configurations
The threading model
The binary protocol
The REST endpoints
The admin CLI options
The metrics
Anything that affects deployment